Interpreting Multiple Correspondence Analysis as a Multidimensional Scaling Method

Authors

  • DONNA L. HOFFMAN
  • JAN DE LEEUW
Abstract

We formulate multiple correspondence analysis (MCA) as a nonlinear multivariate analysis method that integrates ideas from multidimensional scaling. MCA is introduced as a graphical technique that minimizes distances between connecting points in a graph plot. We use this geometrical approach to show how questions posed of categorical marketing research data may be answered with MCA in terms of closeness. We introduce two new displays, the star plot and line plot, which help illustrate the primary geometric features of MCA and enhance interpretation. Our approach, which extends Gifi (1981, 1990), emphasizes easy-to-interpret and managerially relevant MCA maps.

*The authors thank J. Douglas Carroll, Don Lehmann, Donald Morrison, and two anonymous reviewers for their helpful comments on a previous version of this manuscript.

Multiple correspondence analysis (MCA) is well on its way to becoming a popular tool in marketing research (Hoffman & Franke, 1986). For example, Green, Krieger, and Carroll (1987) use MCA to analyze the relationship between consumers' choice profile predictions from a conjoint task and consumer demographic characteristics. In a similar vein, Kaciak and Louviere (1990) illustrate how MCA may be used to analyze data from discrete choice experiments. Carroll and Green (1988) apply individual differences MDS to normalized Burt matrices (a principal data matrix in MCA) to determine the relationship between consumer demographics and automobile characteristics with respect to number of cars in the household. More recently, Valette-Florence and Rapacchi (1991) perform an MCA on the attributes-consequences-values matrix derived from a laddering task to construct a product positioning map, and Hoffman and Batra (1991) apply MCA to study the association between television program types and audience viewing behaviors.

We focus in this paper on the interpretation of MCA maps.
The fundamental issue concerns the appropriate way to represent both the objects corresponding with the rows and the variables corresponding with the columns of the data matrix in the same map. This problem has become increasingly important because the three major statistical packages now have MCA modules in which the choice of scaling of row and column coordinates is left largely to the user (BMDP 1988; SAS Institute Inc. 1988; SPSS Inc. 1989). In addition, the variety of commercially available PC-based programs offers numerous options but little guidance to the user (BMDP 1988; Greenacre 1986; Nishisato and Nishisato 1986; SAS Institute Inc. 1988; Smith, 1988; see also Hoffman (1991) for a review).

We present a geometrical approach to MCA that provides for enhanced representation and interpretation of MCA maps. Our work extends the treatment in Gifi (1981, 1990) by placing increased emphasis on the geometry. Two new interpretive maps are introduced: star plots and the variable line plot. MCA is developed as an MDS method that minimizes the distances between connecting points in a graph plot. We feel that these additional geometric properties make MCA easier to understand.

1. Correspondence analysis as a model

What is multiple correspondence analysis? The French literature (see, for example, Benzécri et al. 1973) discusses it in the context of metric multidimensional scaling suitable for frequency matrices, contingency tables, or cross-tables. Others formulate MCA as factorial analysis of qualitative data using scale analysis (e.g., Nishisato 1980) or principal component analysis (e.g., de Leeuw 1973) perspectives. We formulate MCA in terms of connecting objects, brands say, with all the variable categories they are in, and use a least-squares loss function as the rule to do this. Then, interpretation stems not from chi-square distances or profiles (cf. Hoffman and Franke 1986) but rather from le principe barycentrique, the centroid principle, which says that brands close to each other are similar to each other. Our approach thus emphasizes the geometrical aspects of multiple correspondence analysis.

2. MCA as an MDS method

The concept of homogeneity serves as the basis for our development of multiple correspondence analysis. Homogeneity refers to the extent to which different variables measure the same characteristic or characteristics (Gifi 1981, 1990). Homogeneity thus specifies a type of similarity. There are different measures of homogeneity and different approaches to find maps; the particular choice of loss function defines the former and the specific algorithm employed determines the latter.

Suppose we think of a rectangular data matrix as a multivariable representation (i.e., as a joint map of the brands and the variable categories) in two-dimensional Euclidean space. The map will be more appealing if brands are close to the categories of the variables that they occur in. This is the basic premise of multiple correspondence analysis. By the triangle inequality this implies that brands with similar profiles (i.e., brands that are often in the same categories) will be close, and categories containing roughly the same brands will be close as well. We now formalize these ideas by defining a suitable loss function to be minimized.

2.1. Maximizing variable homogeneity

Let the data be m categorical variables on n objects, with the jth variable taking on k_j different values, its categories. We code the variables using indicator matrices to allow for easy expression in matrix notation. An indicator matrix is a binary matrix (exactly one element equal to one in each row) that indicates the category an object is in for a particular variable.
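
The indicator coding just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `indicator_matrix`, the category labels "a", "b", "c", and the four-object toy variable are hypothetical and are not taken from the paper's data.

```python
def indicator_matrix(values, categories):
    """Build the n x k binary indicator matrix for one categorical variable.

    values:     the category observed for each of the n objects
    categories: the k distinct categories, in the column order wanted
    """
    column = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[column[v]] = 1  # exactly one element per row is set to one
        rows.append(row)
    return rows


# Hypothetical toy variable observed on four objects (not the paper's data).
toy_values = ["a", "b", "a", "c"]
G_toy = indicator_matrix(toy_values, ["a", "b", "c"])
for row in G_toy:
    print(row)
```

Each row of `G_toy` sums to one by construction, which is the defining property of an indicator matrix.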
Thus, if variable j has k_j categories, then G_j, the indicator matrix for this variable, is n × k_j and each row of G_j sums to one. More specifically, consider the example of G = [G_1 | ... | G_m] in table 1, with m = 4, n = 24, k_1 = 3, k_2 = 5, k_3 = 3, and k_4 = 3. Here, the objects are 24 small cars that Consumers Union judged by degree of crash protection (Consumers Union, 1989). These judgments are based on Consumers Union's analysis of National Highway Traffic Safety Administration crash-test data. The two occupant protection variables indicate how well the car protected a driver dummy and a passenger dummy during crash tests. Structural integrity indicates how well the passenger compartment held up to the forces of a crash; better performance is associated with a greater chance of avoiding injuries other than those caused by the immediate forces of a crash. The remaining categorical variable indicates car body style.

The purpose of multiple correspondence analysis is to construct a joint map of cars and variable categories in such a way that a car is relatively close to a category it is in, and relatively far from the categories it is not in. By the triangle inequality, this implies that cars mostly occurring in the same categories tend to be close, while categories sharing mostly the same cars tend to be close as well. The extent to which a particular scaling X of the cars and particular scalings Y_j of the categories satisfy this is quantified by the loss of homogeneity, a least squares loss function:

\sigma(X; Y_1, \ldots, Y_m) = \sum_{j} \mathrm{SSQ}(X - G_j Y_j) \qquad (1)

where SSQ(.) is shorthand for the sum of squares of the elements of a matrix or vector. The loss function in (1), giving the sum of squares of the distances between cars and the categories they occur in, measures departure from perfect homogeneity or similarity. Quite simply, multiple correspondence analysis produces the map with the smallest possible loss (Gifi 1981, 1990).

Table 1. Indicator matrices constructed from Consumers Union's judgments on crash protection for 24 small cars

                                    Occupant protection^b
Car make and model    Body style^a  Driver      Passenger   Structural integrity^c
                      1 2 3         1 2 3 4 5   1 2 3 4 5   1 2 3 4 5
Acura Integra         1 0 0         1 0 0 0 0   1 0 0 0 0   0 1 0 0 0
Daihatsu Charade*     1 0 0         0 1 0 0 0   1 0 0 0 0   0 1 0 0 0
Dodge Colt            0 0 1         0 0 0 0 1   0 1 0 0 0   0 1 0 0 0
Eagle Summit          0 1 0         0 0 0 1 0   0 1 0 0 0   1 0 0 0 0
Ford Escort*          1 0 0         1 0 0 0 0   1 0 0 0 0   1 0 0 0 0
Ford Festiva*         1 0 0         0 0 1 0 0   0 1 0 0 0   0 1 0 0 0
Honda Civic*          1 0 0         0 1 0 0 0   1 0 0 0 0   0 0 1 0 0
Hyundai Excel*        1 0 0         0 0 1 0 0   0 0 1 0 0   0 1 0 0 0
Hyundai Excel         0 1 0         0 0 1 0 0   1 0 0 0 0   0 1 0 0 0
Isuzu I-Mark          0 1 0         0 0 0 0 1   0 0 1 0 0   0 1 0 0 0
Mazda 323*            1 0 0         0 0 0 1 0   0 0 1 0 0   1 0 0 0 0
Mazda RX-7*           1 0 0         0 1 0 0 0   1 0 0 0 0   1 0 0 0 0
Mitsubishi Mirage     0 1 0         0 0 0 1 0   0 1 0 0 0   1 0 0 0 0
Mitsubishi Starion*   1 0 0         0 0 1 0 0   1 0 0 0 0   1 0 0 0 0
Nissan Pulsar NX*     1 0 0         0 0 0 1 0   1 0 0 0 0   0 1 0 0 0
Nissan Sentra         0 1 0         0 0 0 0 1   0 1 0 0 0   0 1 0 0 0
Nissan Sentra         0 0 1         0 0 0 1 0   1 0 0 0 0   1 0 0 0 0
Plymouth Colt         0 0 1         0 0 0 0 1   0 1 0 0 0   0 1 0 0 0
Pontiac LeMans        1 0 0         0 0 1 0 0   0 1 0 0 0   0 1 0 0 0
Subaru Justy*         1 0 0         1 0 0 0 0   1 0 0 0 0   0 0 1 0 0
Toyota Celica         1 0 0         1 0 0 0 0   1 0 0 0 0   1 0 0 0 0
Toyota Tercel*        1 0 0         0 0 1 0 0   1 0 0 0 0   1 0 0 0 0
Volkswagen Golf*      0 1 0         0 0 1 0 0   1 0 0 0 0   0 1 0 0 0
Yugo GV*              1 0 0         0 0 0 0 1   1 0 0 0 0   0 0 1 0 0
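
To make the loss of homogeneity in (1) concrete, the following Python sketch evaluates it on a tiny made-up example. It assumes the loss reads as the sum over variables of SSQ(X - G_j Y_j); the function names, the four-object map X, and the two toy indicator matrices are hypothetical and are not taken from Table 1. The helper `category_centroids` relies on a standard least-squares fact rather than on anything stated in this section: for a fixed X, the Y_j that minimizes the loss places each category at the centroid of the objects in it.

```python
def loss_of_homogeneity(X, G_list, Y_list):
    """Sum over variables j of SSQ(X - G_j Y_j), assuming that form of (1)."""
    total = 0.0
    for G, Y in zip(G_list, Y_list):
        for x_row, g_row in zip(X, G):
            # Row i of G_j Y_j is the point of the category object i is in.
            y_row = Y[g_row.index(1)]
            total += sum((x - y) ** 2 for x, y in zip(x_row, y_row))
    return total


def category_centroids(X, G):
    """Least-squares Y_j for fixed X: the mean of the objects in each category.

    Assumes every category contains at least one object.
    """
    k, p = len(G[0]), len(X[0])
    sums = [[0.0] * p for _ in range(k)]
    counts = [0] * k
    for x_row, g_row in zip(X, G):
        c = g_row.index(1)
        counts[c] += 1
        for d in range(p):
            sums[c][d] += x_row[d]
    return [[s / counts[c] for s in sums[c]] for c in range(k)]


# Hypothetical two-dimensional map of four objects and two binary variables.
X = [[-1.0, 0.0], [-1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
G1 = [[1, 0], [1, 0], [0, 1], [0, 1]]
G2 = [[1, 0], [0, 1], [1, 0], [0, 1]]

Y1 = category_centroids(X, G1)
Y2 = category_centroids(X, G2)
loss = loss_of_homogeneity(X, [G1, G2], [Y1, Y2])
print(loss)  # → 5.0
```

The sketch only evaluates the loss for a given map. It omits any normalization of X, without which the loss is trivially zero when all points coincide, and it does not carry out the minimization itself.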

Similar resources

Evaluating Visual Preferences of Architects and People Toward Housing Facades, Using Multidimensional Scaling Analysis (MDS)

One of the most important issues that have absorbed the public opinion and expert community during the recent years, is the qualitative and quantitative aspects of the housing. There are several challenges related to this topic that includes the contexts of the construction, manufacturing, planning to social aspects, cultural, physical and architectural design. The thing that has a significant ...

Perceptual maps: the good, the bad and the ugly

Perceptual maps are often used in marketing to visually study relations between two or more attributes. However, in many perceptual maps published in the recent literature it remains unclear what is being shown and how the relations between the points in the map can be interpreted or even what a point represents. The term perceptual map refers to plots obtained by a series of different techniqu...

Multiple Correspondence Analysis

Multiple correspondence analysis (MCA) is an extension of correspondence analysis (CA) which allows one to analyze the pattern of relationships of several categorical dependent variables. As such, it can also be seen as a generalization of principal component analysis when the variables to be analyzed are categorical instead of quantitative. Because MCA has been (re)discovered many times, equiv...

Graph Layout Techniques and Multidimensional Data Analysis

ABSTRACT. In this paper we explore the relationship between multivariate data analysis and techniques for graph drawing or graph layout. Although both classes of techniques were created for quite different purposes, we find many common principles and implementations. We start with a discussion of the data analysis techniques, in particular multiple correspondence analysis, multidimensional scal...

Metric scaling graphical representation of categorical data

Metric Scaling is a well-known method to represent a finite set with respect to a given Euclidean distance matrix. Several methods to represent rows and columns of a two-way contingency table are available: Correspondence Analysis, Dual Scaling, Canonical Coordinates, etc. We show that metric scaling provides a similar representation by using Hellinger or Rao distances together with Gower's add-a...

An ExPosition of multivariate analysis with the singular value decomposition in R

ExPosition is a new comprehensive R package providing crisp graphics and implementing multivariate analysis methods based on the singular value decomposition (svd). The core techniques implemented in ExPosition are: principal components analysis, (metric) multidimensional scaling, correspondence analysis, and several of their recent extensions such as barycentric discriminant analyses (e.g., di...

Publication date: 2004